With cloud storage gaining popularity, there is a need to conserve storage space. One technique for conserving storage space is de-duplication of stored data files. Data files that contain the same content are targets for de-duplication. However, identifying which data files contain the same content is problematic. These and other shortcomings of the prior art are addressed by the present disclosure.